05. Unclean Data: Dirty vs. Messy 2

There are two types of unclean data:

  • Dirty data, also known as low-quality data. Low-quality data has content issues.
  • Messy data, also known as untidy data. Untidy data has structural issues.
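
To make the distinction concrete, here is a minimal sketch in plain Python. Everything in it is an assumption made up for illustration and is not taken from this lesson's dataset: the `records` list, the column names, the helper functions `find_quality_issues` and `tidy_contact`, and the 48 to 90 inch plausibility range. The first helper flags content problems (dirty data); the second repairs a structural problem (messy data), where one column holds two variables.

```python
# Hypothetical records, invented for illustration only.
records = [
    {"name": "Ana Silva", "height_in": 64, "contact": "555-0101 ana@example.com"},
    {"name": "Ben Okoro", "height_in": 27, "contact": "555-0102 ben@example.com"},
    {"name": "Cy Tran", "height_in": None, "contact": "555-0103 cy@example.com"},
]


def find_quality_issues(rows):
    """Flag content problems (dirty data): missing or implausible values."""
    issues = []
    for index, row in enumerate(rows):
        height = row["height_in"]
        if height is None:
            issues.append((index, "missing height"))
        elif not 48 <= height <= 90:  # assumed plausible adult range, in inches
            issues.append((index, "implausible height"))
    return issues


def tidy_contact(rows):
    """Fix a structural problem (messy data): split the two-variable
    'contact' column into separate 'phone' and 'email' columns."""
    tidy = []
    for row in rows:
        phone, email = row["contact"].split(" ", 1)
        new_row = {key: value for key, value in row.items() if key != "contact"}
        new_row["phone"] = phone
        new_row["email"] = email
        tidy.append(new_row)
    return tidy


quality_issues = find_quality_issues(records)
tidy_records = tidy_contact(records)

print(quality_issues)   # [(1, 'implausible height'), (2, 'missing height')]
print(tidy_records[0])
```

Note that tidying the `contact` column does nothing about the bad heights, and correcting the heights would not tidy the table. The two kinds of issue are assessed and fixed separately.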

In this lesson, you are going to assess both dirty and messy data. Your job right now is to start distinguishing between the two, even though quality and tidiness (the latter, especially) may not be 100% solidified in your mind yet.

Answer the following quizzes, distinguishing between low quality and untidy data, to set yourself up for success in this lesson.

Note: the data pictured in the animation is a simplified version of the actual dataset used in this lesson.

Quiz 1

Use the image below to answer the first quiz.

Bedroom: Dirty vs. Messy

QUIZ QUESTION:

Which of the following cleanliness issues in a bedroom are dirty? Which are messy? Please match accordingly.

ANSWER CHOICES:



Cleanliness Issue

Dirty or Messy?

Messy

Messy

Dirty

Dirty

SOLUTION:

Cleanliness Issue

Dirty or Messy?

Messy

Messy

Messy

Messy

Dirty

Dirty

Dirty

Dirty

Quiz 2

Use the image below to answer the second quiz.

Data: Quality vs. Tidiness

QUIZ QUESTION:

Which of the following cleanliness issues in the pictured dataset are quality issues? Which are tidiness issues? Please match accordingly.

ANSWER CHOICES:



Cleanliness Issue

Quality or Tidiness Issue?

Quality

Tidiness

Tidiness

Quality

SOLUTION:

Cleanliness Issue

Quality or Tidiness Issue?

Quality

Quality

Tidiness

Tidiness

Tidiness

Tidiness

Quality

Quality